Alignment Model Adaptation for Domain-Specific Word Alignment
Abstract
This paper proposes an alignment adaptation approach to improve domain-specific (in-domain) word alignment. The basic idea of alignment adaptation is to use an out-of-domain corpus to improve in-domain word alignment results. We first train two statistical word alignment models, one on the large-scale out-of-domain corpus and one on the small-scale in-domain corpus, and then interpolate the two models to improve domain-specific word alignment. Experimental results show that the approach improves domain-specific word alignment in both precision and recall, achieving a relative error rate reduction of 6.56% compared with state-of-the-art technologies.
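The interpolation step can be sketched in a few lines. The abstract says only that the two models are interpolated, so the linear form, the weight `lam`, the function name `interpolate_tables`, and the toy probabilities below are all assumptions made for illustration, not the authors' actual formulation.

```python
def interpolate_tables(p_in, p_out, lam=0.5):
    """Linearly interpolate two lexical translation tables.

    Each table maps (source_word, target_word) -> probability.
    lam is the weight given to the in-domain table (a hypothetical
    parameter; the abstract does not say how the weight is chosen).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    # A pair missing from one table contributes probability 0 from that table.
    pairs = set(p_in) | set(p_out)
    return {
        pair: lam * p_in.get(pair, 0.0) + (1.0 - lam) * p_out.get(pair, 0.0)
        for pair in pairs
    }


# Toy tables with made-up numbers, for illustration only.
p_in_domain = {("bank", "banque"): 0.9, ("bank", "rive"): 0.1}
p_out_domain = {("bank", "banque"): 0.5, ("bank", "rive"): 0.5}

adapted = interpolate_tables(p_in_domain, p_out_domain, lam=0.7)
```

If both input tables are normalized per source word, the interpolated table stays normalized, because a convex combination of two distributions is itself a distribution.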
Similar resources
Image alignment via kernelized feature learning
Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption of most machine learning algorithms is that the training set (source domain) and the test set (target domain) follow the same probability distribution. However, in most of the real-world application...
Alignment Inference and Bayesian Adaptation for Machine Translation
We propose a flexible and efficient domain adaptation method that yields consistent improvements in machine translation (for 11 language pairs). The idea is to decompose the word alignment process into two steps, model training and alignment inference, and perform Bayesian adaptation on the latter. This modularity allows one to incorporate out-of-domain data without the need to modify existing ...
Improving Domain-Specific Word Alignment with a General Bilingual Corpus
In conventional word alignment methods, some employ statistical models or statistical measures, which need large-scale bilingual sentence-aligned training corpora. Others employ dictionaries to guide alignment selection. However, these methods achieve unsatisfactory alignment results when performing word alignment on a small-scale domain-specific bilingual corpus without terminological lexicons....
Sequence segmentation for statistical machine translation
In the last decade, while statistical machine translation has advanced significantly, there is still much room for further improvements relating to many natural language processing tasks such as word segmentation, word alignment and parsing. Human language is composed of sequences of meaningful units. These sequences can be words, phrases, sentences or even articles serving as basic elements in...